Summary
Feature tracks used in characterization of DNA methylation in the Pacific oyster genome [version oyster.v9_90 (fasta)]
This version of the genome represents the longest genomic scaffolds (1670; 14%), which cover over 90% of the genome.
Derived from genome build available at
Zhang, G; Fang, X; Guo, X; Li, L; Luo, R; Xu, F; Yang, P; Zhang, L; Wang, X; Qi, H; Zhu, Y; Yang, L; Huang, Z (2012) Genomic data from the Pacific oyster (Crassostrea gigas). GigaScience.
http://dx.doi.org/10.5524/100030
[Track] oyster.v9_90 all CGs
http://eagle.fish.washington.edu/Mollusk/174gm_analysis/Bedtools_Intersect/oyster.v9_90_allCGs
Preview:
scaffold22 fuzznuc misc_feature 69 70 2.000 + . Sequence "scaffold22.1" ; note "*pat pattern1"
scaffold22 fuzznuc misc_feature 73 74 2.000 + . Sequence "scaffold22.2" ; note "*pat pattern1"
scaffold22 fuzznuc misc_feature 93 94 2.000 + . Sequence "scaffold22.3" ; note "*pat pattern1"
scaffold22 fuzznuc misc_feature 156 157 2.000 + . Sequence "scaffold22.4" ; note "*pat pattern1"
scaffold22 fuzznuc misc_feature 191 192 2.000 + . Sequence "scaffold22.5" ; note "*pat pattern1"
scaffold22 fuzznuc misc_feature 240 241 2.000 + . Sequence "scaffold22.6" ; note "*pat pattern1"
Description:
fuzznuc on oyster.v9_90 fasta file.
[Track] Methylated CpGs
http://eagle.fish.washington.edu/Mollusk/174gm_analysis/MethylatedCG_BED.bed
Preview:
scaffold1 263 263 CG 0.300 +
scaffold1 267 267 CG 0.100 +
scaffold1 9470 9470 CG 0.188 +
scaffold1 18706 18706 CG 0.071 +
scaffold1 20215 20215 CG 0.077 +
Description:
BSMAP was used to map paired-end (PE) bisulfite Illumina reads from a sperm sample.
c1= scaffold, c2= start, c3= end, c4= motif, c5= methylation ratio (proportion, 0 to 1), c6= strand
./bsmap -a /Volumes/web/whale/ce_bs/filtered_174gm_A_NoIndex_L006_R1.fastq.gz -b /Volumes/web/whale/ce_bs/filtered_174gm_A_NoIndex_L006_R2.fastq.gz -d /Volumes/web/whale/ce_bs/oyster.v9_90.fa -o /Volumes/web/Mollusk/BSMAPoutput_174gm_v9_90.sam -p 8
python methratio.py -d /Volumes/web/whale/ce_bs/oyster.v9_90.fa -o /Volumes/web/Mollusk/methratiopython_174gm_v9_90.txt -s /Users/Shared/Apps/bsmap-2.73/samtools -z -u /Volumes/web/Mollusk/174gm_analysis/BSMAPoutput_174gm_v9_90.sa
Total number of aligned reads:
total 145949462 valid mappings, 123681367 covered cytosines, average coverage: 11.86 fold
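A minimal sketch of reading one line of this track in Python, assuming the six whitespace-separated columns c1 to c6 described above. The names `MethCall` and `parse_meth_line` are hypothetical helpers for illustration and are not part of BSMAP.

```python
from typing import NamedTuple


class MethCall(NamedTuple):
    scaffold: str   # c1
    start: int      # c2
    end: int        # c3
    motif: str      # c4
    ratio: float    # c5, as written in the track (e.g. 0.300)
    strand: str     # c6


def parse_meth_line(line: str) -> MethCall:
    """Split one whitespace-delimited track line into typed fields."""
    scaffold, start, end, motif, ratio, strand = line.split()
    return MethCall(scaffold, int(start), int(end), motif, float(ratio), strand)


# First line of the preview above.
call = parse_meth_line("scaffold1 263 263 CG 0.300 +")
print(call.scaffold, call.start, call.ratio)
```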
[Track] Unmethylated CpGs
http://eagle.fish.washington.edu/Mollusk/174gm_analysis/NoMethCG_BED.bed
Preview
scaffold1 64 64 CG 0.000 +
scaffold1 128 128 CG 0.000 +
scaffold1 10530 10530 CG 0.000 +
scaffold1 10569 10569 CG 0.000 +
scaffold1 11745 11745 CG 0.000 +
Description:
[Track] Methylated CpGs
http://eagle.fish.washington.edu/cnidarian/TJGR_GonadPE_BS_v9_90_CG_10x_METHbed.txt
Preview
scaffold1 9470 9470 CG 0.162 +
scaffold1 16825 16825 CG 0.067 +
scaffold1 18706 18706 CG 0.077 +
scaffold1 20215 20215 CG 0.071 +
scaffold1 20756 20756 CG 0.600 +
Description:
BSMAP was used to map paired-end (PE) bisulfite Illumina reads from a sperm sample.
./bsmap -a /Volumes/web/whale/ce_bs/filtered_174gm_A_NoIndex_L006_R1.fastq.gz -b /Volumes/web/whale/ce_bs/filtered_174gm_A_NoIndex_L006_R2.fastq.gz -d /Volumes/web/whale/ce_bs/oyster.v9_90.fa -o /Volumes/web/whale/ce_bs/BSMAP_output_PE_v9_90.sam -p 8
Total number of aligned reads:
pairs: 85147571 (50%)
single a: 16704916 (9.7%)
single b: 15703005 (9.2%)
python methratio.py -d /Volumes/web/whale/ce_bs/oyster.v9_90.fa -u -p -q -z -o /Volumes/web/whale/ce_bs/OUT_methratio_gonadPE_v9_90_B.txt -s /Users/Shared/Apps/bsmap-2.73/samtools /Volumes/web/whale/ce_bs/BSMAP_output_PE_v9_90.sam
The output was trimmed in Galaxy to retain only CG sites on the positive strand with 10x coverage.
Specifically, the methratio output was run through this Galaxy workflow:
http://eagle.fish.washington.edu/cnidarian/Galaxy-Workflow-methratio_processing_BED.ga
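The Galaxy workflow itself is not reproduced here. As a rough sketch of the filtering described (CG only, positive strand, 10x coverage, then a split into methylated and unmethylated sites), the step could look like the following in Python. The row layout `(scaffold, position, strand, context, ratio, coverage)`, the function name `split_cg_calls`, the `>= 10` reading of "10x", and the `ratio > 0` split are all assumptions made for illustration.

```python
def split_cg_calls(rows, min_cov=10):
    """Return (methylated, unmethylated) six-column records.

    rows: iterable of (scaffold, position, strand, context, ratio, coverage)
    tuples; this layout is a hypothetical, already-parsed form of the
    methratio output.
    """
    meth, nometh = [], []
    for scaffold, pos, strand, context, ratio, cov in rows:
        # Keep only CG sites on the positive strand with enough coverage.
        if context != "CG" or strand != "+" or cov < min_cov:
            continue
        record = (scaffold, pos, pos, "CG", ratio, strand)
        # Assumption: any ratio above zero counts as methylated.
        (meth if ratio > 0 else nometh).append(record)
    return meth, nometh


# Illustrative rows only; the coverage values are made up.
example_rows = [
    ("scaffold1", 9470, "+", "CG", 0.162, 37),
    ("scaffold1", 13612, "+", "CG", 0.0, 12),
    ("scaffold1", 13700, "-", "CG", 0.5, 20),   # wrong strand, dropped
    ("scaffold1", 13800, "+", "CG", 0.4, 5),    # below 10x, dropped
]
meth_calls, nometh_calls = split_cg_calls(example_rows)
```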
[Track] Unmethylated CpGs
http://eagle.fish.washington.edu/cnidarian/TJGR_GonadPE_BS_v9_90_CG_10x_NOmethbed.txt
Preview
scaffold1 13612 13612 CG 0.000 +
scaffold1 13822 13822 CG 0.000 +
scaffold1 13936 13936 CG 0.000 +
scaffold1 13967 13967 CG 0.000 +
scaffold1 14032 14032 CG 0.000 +
Description:
Second product of Galaxy Workflow above.
[Track] Methylation Clusters 10-4
http://eagle.fish.washington.edu/cnidarian/CpGMeth_clusters_10_4_v9_90_bed.bed
Preview
scaffold43964 14454 14480
scaffold43964 49922 49938
scaffold43964 52400 52420
scaffold43964 58274 58294
scaffold43964 128548 128573
Description:
Based on above track. Intervals where the maximum distance between mCpG is 10bp, and the minimum # of mCpG is 4. Includes 9974 features.
[Track] Methylation Clusters 50-5
http://eagle.fish.washington.edu/cnidarian/CpGMeth_clusters_50_5_v9_90_bed.bed
Preview
scaffold43964 3034 3111
scaffold43964 14417 14480
scaffold43964 25418 25493
scaffold43964 39824 39905
scaffold43964 49292 49349
Description:
Based on above track. Intervals where the maximum distance between mCpG is 50bp, and the minimum # of mCpG is 5. Includes 40890 features.
[Track] Methylation Clusters 100-4
http://eagle.fish.washington.edu/cnidarian/CpGMeth_clusters_100_4_v9_90_bed.bed
Preview
scaffold43964 3034 3111
scaffold43964 14221 14291
scaffold43964 14417 14689
scaffold43964 19154 19187
scaffold43964 24969 25108
Description:
Based on above track. Intervals where the maximum distance between mCpG is 100bp, and the minimum # of mCpG is 4. Includes 79161 features.
Corresponding fasta:
http://eagle.fish.washington.edu/cnidarian/CpGMeth_clusters_100_4.fa
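The cluster tracks are defined by a maximum distance between neighbouring mCpG and a minimum number of mCpG per interval. The tool actually used to build them is not stated, so the following Python is only a sketch of one way to apply that definition to the sorted mCpG positions of a single scaffold. The function name `cluster_positions`, the use of the plain position difference as "distance", and reporting the first and last mCpG as the interval bounds are assumptions.

```python
def cluster_positions(positions, max_gap, min_count):
    """Group positions into runs where neighbours are at most max_gap apart.

    Returns (first, last) for every run holding at least min_count positions.
    """
    clusters = []
    run = []
    for pos in sorted(positions):
        # A gap larger than max_gap closes the current run.
        if run and pos - run[-1] > max_gap:
            if len(run) >= min_count:
                clusters.append((run[0], run[-1]))
            run = []
        run.append(pos)
    # Flush the final run.
    if len(run) >= min_count:
        clusters.append((run[0], run[-1]))
    return clusters


# Illustrative positions only, using the "10-4" parameters.
demo_positions = [100, 105, 112, 120, 300, 305, 500, 508, 515, 522, 530]
demo_clusters = cluster_positions(demo_positions, max_gap=10, min_count=4)
```

Calling the same function with `max_gap=50, min_count=5` or `max_gap=100, min_count=4` corresponds to the other two parameter sets.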
[Track] Repeats
http://eagle.fish.washington.edu/cnidarian/rm_020713/oysterv9_90.fa.out.gff
Preview
scaffold1 RepeatMasker similarity 9873 9897 0.0 + . Target "Motif:AT_rich" 1 25
scaffold1 RepeatMasker similarity 12513 12553 0.0 + . Target "Motif:(GA)n" 1 41
scaffold1 RepeatMasker similarity 16199 16242 18.2 + . Target "Motif:AT_rich" 1 44
scaffold1 RepeatMasker similarity 16261 16334 21.6 + . Target "Motif:AT_rich" 1 74
scaffold1 RepeatMasker similarity 16494 16522 3.5 + . Target "Motif:AT_rich" 1 29
Description:
RepeatMasker with Repbase.
Summary table:
http://eagle.fish.washington.edu/cnidarian/rm_020713/oyster.v9.fa.tbl
[Track] Transposable Elements
http://eagle.fish.washington.edu/cnidarian/TJGR_TE_oysterv9_90.gff
scaffold999 TRF Tandem_Repeat 166754 166792 69 + . .
scaffold1 TRF Tandem_Repeat 12513 12553 82 + . .
scaffold1259 WUBlastX MuDR1x_AP 15516 15635 50 - . DNA
scaffold1327 WUBlastX Zator-3_AAe 333539 334297 105 - . DNA
scaffold1627 WUBlastX Zator-3_AAe 151603 151785 32 + . DNA
Description
RepeatProteinMask
[Track] CDS
http://aquacul4.fish.washington.edu/~steven/armina/oyster.v9.glean.final.rename.CDS.gff
Preview
scaffold980 GLEAN CDS 134604 134778 . + 2 Parent=CGI_10019211;
scaffold980 GLEAN CDS 141499 141593 . + 1 Parent=CGI_10019211;
scaffold980 GLEAN CDS 142711 142811 . + 2 Parent=CGI_10019211;
scaffold980 GLEAN CDS 143780 143896 . + 0 Parent=CGI_10019211;
scaffold980 GLEAN CDS 144887 145029 . + 0 Parent=CGI_10019211;
[Track] Introns
http://eagle.fish.washington.edu/Mollusk/174gm_analysis/oysterv9_90_Introns.bed
Preview
scaffold22 8845 13192
scaffold22 13237 14157
scaffold22 14229 15108
scaffold22 15180 15773
scaffold22 19018 19239
Description
[Track] Introns divided into 50bp windows
http://eagle.fish.washington.edu/cnidarian/oysterv9_90_Intron_50pbWindows.bed
Preview
scaffold22 8845 8895
scaffold22 8895 8945
scaffold22 8945 8995
scaffold22 8995 9045
scaffold22 9045 9095
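A short Python sketch of how an interval can be cut into consecutive 50bp windows, matching the pattern in the preview. How the original file was generated is not stated; the function name `make_windows` and the choice to keep a shorter final window are assumptions.

```python
def make_windows(start, end, size=50):
    """Split [start, end) into consecutive windows of `size` bp.

    The last window is truncated at `end` if the interval length is not
    a multiple of `size` (an assumption about the desired behaviour).
    """
    return [(s, min(s + size, end)) for s in range(start, end, size)]


# First intron from the Introns preview: scaffold22 8845 13192
intron_windows = make_windows(8845, 13192)
for w_start, w_end in intron_windows[:3]:
    print("scaffold22", w_start, w_end)
```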
[Track] mRNA
http://aquacul4.fish.washington.edu/~steven/armina/oyster.v9.glean.final.rename.mRNA.gff
Preview
scaffold6 GLEAN mRNA 684420 688461 0.811719 + . ID=CGI_10022332;
scaffold6 GLEAN mRNA 694464 700813 0.235103 + . ID=CGI_10022333;
scaffold6 GLEAN mRNA 701995 741494 0.270237 + . ID=CGI_10022334;
scaffold1710 GLEAN mRNA 22769 26100 0.999946 + . ID=CGI_10022335;
scaffold1710 GLEAN mRNA 66509 80594 0.877603 + . ID=CGI_10022336;
[Track] Promoter Region:
http://eagle.fish.washington.edu/cnidarian/TJGR_genes_v9_promoter_5p1000.gff
Preview
scaffold40150 GLEAN promoter 53687 54687 0.999676 - . ID=CGI_10003906;
scaffold40150 GLEAN promoter 61510 62510 0.998077 - . ID=CGI_10003907;
scaffold40150 GLEAN promoter 82433 83433 1 - . ID=CGI_10003910;
scaffold1177 GLEAN promoter 70856 71856 0.889891 - . ID=CGI_10003913;
scaffold40178 GLEAN promoter 50250 51250 0.999219 - . ID=CGI_10003915;
[Track] NonCDS 50bp windows
http://eagle.fish.washington.edu/cnidarian/TJGR_NonCDS_50window.bed
scaffold22 0 50
scaffold22 50 100
scaffold22 100 150
scaffold22 150 200
scaffold22 200 250
Description: Complement of CDS interval in 50bp windows
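The complement step can be sketched in Python as follows, assuming BED-style 0-based, half-open intervals on a single scaffold of known length. The function name `complement` and the example coordinates are hypothetical; the resulting gaps would then be cut into 50bp windows.

```python
def complement(intervals, scaffold_length):
    """Return the regions of [0, scaffold_length) not covered by intervals."""
    gaps = []
    cursor = 0
    for start, end in sorted(intervals):
        # Anything between the cursor and the next interval is uncovered.
        if start > cursor:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < scaffold_length:
        gaps.append((cursor, scaffold_length))
    return gaps


# Made-up CDS intervals on a made-up 1000 bp scaffold.
non_cds = complement([(100, 200), (150, 300), (500, 600)], 1000)
```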
[Track] +100x Mgo Expression - NonCDS 50bp windows
http://eagle.fish.washington.edu/cnidarian/TJGR_NonCDS_50window_100xMgo.bed
scaffold1 100112 100162 121
scaffold1 100162 100212 118
scaffold1 100212 100262 106
scaffold100 80833 80883 4279
scaffold100 82089 82139 555
Description: bedtools (coverageBed) using the TopHat BAM output (-a) and the NonCDS 50bp window BED (-b), with the split option.
[Track] +20x Mgo Expression - NonCDS 50bp windows
http://eagle.fish.washington.edu/cnidarian/TJGR_NonCDS_50window_20xMgo.bed
Preview
scaffold1 21144 21194 20
scaffold1 23024 23074 27
scaffold1 23074 23124 30
scaffold1 23124 23174 26
scaffold1 23174 23224 23
Description: bedtools (coverageBed) using the TopHat BAM output (-a) and the NonCDS 50bp window BED (-b), with the split option.
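The track names (+100x, +20x) suggest that windows were kept when the count in the fourth column met a threshold; that reading is an inference from the names and previews, not something the descriptions state. Under that assumption, a minimal Python filter might look like this (`filter_by_count` is a hypothetical helper):

```python
def filter_by_count(lines, min_count):
    """Keep window lines whose fourth column is at least min_count."""
    kept = []
    for line in lines:
        fields = line.split()
        if int(fields[3]) >= min_count:
            kept.append(line)
    return kept


# Lines taken from the two previews above.
window_lines = [
    "scaffold1 21144 21194 20",
    "scaffold1 100112 100162 121",
    "scaffold100 80833 80883 4279",
]
over_100 = filter_by_count(window_lines, 100)
over_20 = filter_by_count(window_lines, 20)
```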
[Track] SNPs Mgo RNA-seq Tophat
http://eagle.fish.washington.edu/cnidarian/TJGR_MgoSNP_vcf_to_gff.gff
Preview
scaffold1 SAMTools SNP 18600 18600 33.8 . . REF=C;ALT=T;FILTER=.;INFO=DP%3D2%3BVDB%3D0.0160%3BAF1%3D1%3BAC1%3D2%3BDP4%3D0%2C0%2C2%2C0%3BMQ%3D50%3BFQ%3D-33;FORMAT=GT:PL:GQ;SAMPLE=1/1:65%2C6%2C0:10
scaffold1 SAMTools SNP 18913 18913 4.77 . . REF=A;ALT=C;FILTER=.;INFO=DP%3D1%3BAF1%3D1%3BAC1%3D2%3BDP4%3D0%2C0%2C1%2C0%3BMQ%3D50%3BFQ%3D-30;FORMAT=GT:PL:GQ;SAMPLE=0/1:33%2C3%2C0:3
scaffold1 SAMTools SNP 21342 21342 117 . . REF=T;ALT=A;FILTER=.;INFO=DP%3D31%3BVDB%3D0.0445%3BAF1%3D1%3BAC1%3D2%3BDP4%3D0%2C0%2C23%2C5%3BMQ%3D50%3BFQ%3D-111;FORMAT=GT:PL:GQ;SAMPLE=1/1:150%2C84%2C0:99
scaffold1 SAMTools SNP 21381 21381 222 . . REF=G;ALT=A;FILTER=.;INFO=DP%3D37%3BVDB%3D0.0394%3BAF1%3D1%3BAC1%3D2%3BDP4%3D0%2C0%2C32%2C5%3BMQ%3D50%3BFQ%3D-138;FORMAT=GT:PL:GQ;SAMPLE=1/1:255%2C111%2C0:99
scaffold1 SAMTools SNP 23620 23620 165 . . REF=A;ALT=T;FILTER=.;INFO=DP%3D16%3BVDB%3D0.0440%3BAF1%3D0.5%3BAC1%3D1%3BDP4%3D2%2C6%2C0%2C8%3BMQ%3D50%3BFQ%3D168%3BPV4%3D0.47%2C0.42%2C1%2C1;FORMAT=GT:PL:GQ;SAMPLE=0/1:195%2C0%2C254:99
Description:
SNPs identified in bam file from Mgo Tophat alignment
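In the preview, the VCF INFO and SAMPLE values are percent-encoded inside the GFF attribute column (for example `%3D` for `=`, `%3B` for `;`, `%2C` for `,`). A small Python sketch for decoding them, using the standard library `urllib.parse.unquote`; the helper names `parse_snp_attributes` and `parse_info` are hypothetical.

```python
from urllib.parse import unquote


def parse_snp_attributes(column9):
    """Split the GFF attribute column on ';' and percent-decode each value."""
    attrs = {}
    for field in column9.split(";"):
        key, _, value = field.partition("=")
        attrs[key] = unquote(value)
    return attrs


def parse_info(info):
    """Split a decoded VCF INFO string such as 'DP=2;MQ=50' into a dict."""
    out = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        out[key] = value
    return out


# Attribute column of the first preview line.
raw = ("REF=C;ALT=T;FILTER=.;INFO=DP%3D2%3BVDB%3D0.0160%3BAF1%3D1%3BAC1%3D2"
       "%3BDP4%3D0%2C0%2C2%2C0%3BMQ%3D50%3BFQ%3D-33;FORMAT=GT:PL:GQ;"
       "SAMPLE=1/1:65%2C6%2C0:10")
snp_attrs = parse_snp_attributes(raw)
snp_info = parse_info(snp_attrs["INFO"])
print(snp_attrs["REF"], snp_attrs["ALT"], snp_info["DP"])
```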
Other files
GO Analyses:
http://eagle.fish.washington.edu/cnidarian/TJGR_Gene_GO_GOslim.txt
MBD-bisulfite seq data (gill tissue):
*data used in multivariate class*
http://128.95.149.81/bivalvia/All%20data%201059%20genes.xls
http://eagle.fish.washington.edu/cnidarian/MG_alldata1059.txt
. CG in exon CG intron CG total %CG in MBD exon %CG in MBD intron %CG in MBD total . Dgl Fgo Gil Amu Hem Lpa Mgo overall abundance abundCV . cell adhesion cell cycle and proliferation cell organization and biogenesis cell-cell signaling death developmental processes DNA metabolism other biological processes other metabolic processes protein metabolism RNA metabolism signal transduction stress response transport . cell adhesion cell cycle and proliferation cell organization and biogenesis cell-cell signaling death developmental processes DNA metabolism other biological processes other metabolic processes protein metabolism RNA metabolism signal transduction stress response transport
CGI_10026228 16 11 27 0.025316456 0 0.022222222 CGI_10026228 0 0 0 0 0 0 0.071608608 0.010229801 264.5751311 CGI_10026228 1 CGI_10026228 N N N N N N N N N N N N N Y
CGI_10026611 22 0 22 0 0.035294118 0.03 CGI_10026611 0 0 0 0.099655867 0 0 0 0.014236552 264.5751311 CGI_10026611 1 CGI_10026611 N N N N N N N Y N N N N N N
CGI_10027943 42 11 53 0 0.126213592 0.087837838 CGI_10027943 0 0 0 0.0751982 0 0 0.048001375 0.017599939 176.5122486 CGI_10027943 1 1 CGI_10027943 N N N N N N Y Y N N N N N N
Methods: DNA methylation data
Analyses are based on the results of high-resolution methylation analysis of genomic DNA from pooled oyster gill tissue (n=8). Briefly, genomic DNA was isolated and methylation enrichment was performed using the MethylMiner Kit (Invitrogen) following the manufacturer’s instructions. A bisulfite-treated DNA library of the methylation-enriched fraction was prepared for Illumina sequencing at the University of Washington high-throughput sequencing facility (Seattle, WA). High-throughput reads were mapped back to a subset of the oyster genome that included scaffolds longer than 1 million bp (Zhang et al., 2012). Mapping of the bisulfite-treated reads was performed using BSMAP software (version 2.73). Cytosines in a CG dinucleotide context with greater than 5x coverage in the MBD library were considered to be methylated if at least one of the reads remained unconverted by the bisulfite treatment. One thousand fifty-five oyster genes were evaluated for further analysis of methylation and other gene attributes. Genes were selected if at least 1 CG dinucleotide had 5x coverage in the MBD library, and were further limited to genes that were expressed in at least 1 of 6 oyster tissues based on the dataset of Zhang et al. (2012). The proportion of methylation for a given gene was calculated by dividing the number of methylated cytosines by the total number of CG dinucleotides in the sequence. The proportion of methylation for exonic regions and for intronic regions was also calculated per gene.
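The calling rule and the per-gene proportion described in the methods can be sketched in Python as below. This is an illustration of the stated rule, not the original analysis code; the function names and the example numbers are hypothetical.

```python
def is_methylated(coverage, unconverted_reads, min_cov=5):
    """A CG counts as methylated if coverage is greater than 5x and at
    least one read remained unconverted after bisulfite treatment."""
    return coverage > min_cov and unconverted_reads >= 1


def proportion_methylated(cg_sites, total_cg):
    """Methylated CG count divided by the total number of CG in the sequence.

    cg_sites: iterable of (coverage, unconverted_reads) per covered CG.
    total_cg: number of CG dinucleotides in the gene (or exon/intron) sequence.
    """
    methylated = sum(1 for cov, unconv in cg_sites if is_methylated(cov, unconv))
    return methylated / total_cg


# Made-up example: four covered CG sites in a gene containing 27 CG in total.
example_sites = [(12, 3), (8, 0), (4, 2), (20, 1)]
gene_proportion = proportion_methylated(example_sites, total_cg=27)
```

The same function applies to the exonic and intronic proportions by passing only the sites and CG totals for those regions.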
[Track] MethylKit analysis results - sperm methylation compared to gill by individual CG
http://eagle.fish.washington.edu/bivalvia/files%20for%20methylKit/diffmeth_bytissue_allCG_v9_90.txt
Preview
"","id","chr","start","end","strand","pvalue","qvalue","meth.diff"
"1","scaffold1.105280","scaffold1",105280,105280,"+",1,0.72307975717804,8.33333333333333
"2","scaffold1.105289","scaffold1",105289,105289,"+",1,0.72307975717804,7.14285714285714
"3","scaffold1.154709","scaffold1",154709,154709,"+",0.0019663626474772,0.00796012720063931,-46.1538461538462
"4","scaffold1.154924","scaffold1",154924,154924,"+",1,0.72307975717804,5.95238095238095
Description: Intervals are all CG sites with 10x coverage that were analyzed in both gill and sperm. Reads from both tissues were mapped to oyster_v9_90. The methylKit analysis gives the p-value, q-value, and percent difference in methylation for each site. The sperm is considered the 'control' in this analysis, so positive values in the meth.diff column indicate higher methylation in the sperm.
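A minimal Python sketch for pulling differentially methylated sites out of this comma-separated file, using the column names shown in the preview. The cutoffs (`max_q=0.01`, `min_abs_diff=25.0`) and the function name `significant_sites` are assumptions chosen for illustration; the source does not state which thresholds were applied.

```python
import csv
import io


def significant_sites(csv_text, max_q=0.01, min_abs_diff=25.0):
    """Return (chr, start, meth.diff) for rows passing both cutoffs."""
    reader = csv.DictReader(io.StringIO(csv_text))
    hits = []
    for row in reader:
        q = float(row["qvalue"])
        diff = float(row["meth.diff"])
        if q < max_q and abs(diff) >= min_abs_diff:
            hits.append((row["chr"], int(row["start"]), diff))
    return hits


# The preview lines above, verbatim.
preview = '''"","id","chr","start","end","strand","pvalue","qvalue","meth.diff"
"1","scaffold1.105280","scaffold1",105280,105280,"+",1,0.72307975717804,8.33333333333333
"2","scaffold1.105289","scaffold1",105289,105289,"+",1,0.72307975717804,7.14285714285714
"3","scaffold1.154709","scaffold1",154709,154709,"+",0.0019663626474772,0.00796012720063931,-46.1538461538462
"4","scaffold1.154924","scaffold1",154924,154924,"+",1,0.72307975717804,5.95238095238095
'''
dm_sites = significant_sites(preview)
```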
[Track] MethylKit analysis results - sperm methylation compared to gill by 100bp tile
http://eagle.fish.washington.edu/bivalvia/diffmeth_bytissue_100bptile_v9_90.txt
Preview:
id chr start end strand pvalue qvalue meth.diff
1 scaffold1.105201.105300 scaffold1 105201 105300 * 0.544685352 0.57355062 7.317073171
2 scaffold1.130301.130400 scaffold1 130301 130400 * 4.99E-08 3.13E-07 -100
3 scaffold1.154701.154800 scaffold1 154701 154800 * 0.00013902 0.000525279 -44.69026549
4 scaffold1.154901.155000 scaffold1 154901 155000 * 0.647937411 0.631058554 9.523809524
5 scaffold1.155601.155700 scaffold1 155601 155700 * 2.39E-175 6.58E-172 -91.27868169
Description:
Intervals are all 100bp tiles that were analyzed in both gill and sperm. Reads from both tissues were mapped to oyster_v9_90. The methylKit analysis gives the p-value, q-value, and percent difference in methylation for each tile. The sperm is considered the 'control' in this analysis, so positive values in the meth.diff column indicate higher methylation in the sperm.
Summary Statistics for Gill RNAseq coverage on CDS, grouped by gene
http://eagle.fish.washington.edu/cnidarian/TJGR_Gil_cov_CDS_stats_cv.txt
Preview
CGI_10011974 3.676751918 12 0.222693761 0.306395993 0.471904398 0 1.47826087 154.0178099
CGI_10014715 12.11476909 17 1.205523862 0.712633476 1.097963507 0 2.951219512 154.0712784
CGI_10021734 0.050157776 7 0.000121899 0.007165397 0.011040788 0 0.03125 154.0848123
CGI_10015964 0.028011204 5 7.45E-05 0.005602241 0.008633633 0 0.019607843 154.1103512
CGI_10004322 65.58826056 6 283.8299465 10.93137676 16.84725338 0 41.6056338 154.1183124
Description:
Columns:
ID sum CDScount var avg stdev min max cv
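The preview rows are consistent with avg = sum / CDScount, stdev = sqrt(var), and cv = 100 * stdev / avg. A Python sketch that computes the same columns from a list of per-CDS coverage values follows; the function name `coverage_stats` and the use of the sample variance (`statistics.variance`) are assumptions, since the source does not say which variance estimator was used.

```python
import statistics


def coverage_stats(gene_id, values):
    """Summarise per-CDS coverage values for one gene.

    Needs at least two values, because the sample variance is used
    (an assumption about the original calculation).
    """
    total = sum(values)
    avg = total / len(values)
    var = statistics.variance(values)
    stdev = var ** 0.5
    # Coefficient of variation as a percentage of the mean.
    cv = 100 * stdev / avg if avg else float("nan")
    return {
        "ID": gene_id, "sum": total, "CDScount": len(values), "var": var,
        "avg": avg, "stdev": stdev, "min": min(values), "max": max(values),
        "cv": cv,
    }


# Made-up coverage values for a hypothetical gene with three CDS features.
stats_row = coverage_stats("example_gene", [1.0, 2.0, 3.0])
```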